The dynamics of folding of proteins is studied by means of 
a phenomenological master equation. The energy distribution 
is taken as a truncated exponential for the misfolded states 
plus a native state sitting below the continuum. The influence 
of the gap on the folding dynamics is studied, for various 
models of the transition probabilities between the different 
states of the protein. We show that for certain models, the 
relaxation to the native state is accelerated by increasing the 
gap, whereas for others it is slowed down.



I. INTRODUCTION 

Proteins are known to fold into a unique native state
that is biologically active [1]. Despite their complexity
and frustrated character, they fold fast: folding times
range from milliseconds to seconds. However, these times
are long compared to the usual collapse time of a
homopolymer (a few microseconds). This means that the
diversity of the amino acids and the resulting frustration
play a crucial role in the slowing down. The main picture
that has emerged to describe this type of dynamics is that
of a "funnel-like" phase space [?]: the folding pathway of
a protein in phase space runs along a unique but rough
funnel towards its ground state. So far, however, this
picture is not based on microscopic models.

Numerical simulations of lattice models of short disordered
polymers tend to support this funnel picture. A number of
these simulations seem to show that "good folders"
(sequences that fold rapidly to their ground state, and are
thus good candidates to represent real proteins) indeed
follow a "funnel" in the energy landscape, whereas "bad
folders" possess a large number of energetic traps along
the dynamical path that slow down the folding process.

Several studies have tried to relate the foldability of
model proteins to their energy gap. It has been argued in
ref. [?] that model proteins which have a large energy gap
between the native state and the first conformationally
different (compact) state fold rapidly. On the other hand,
other studies [?] have shown that the parameter which
governs the rapidity of folding is the distance of the
folding temperature to the collapse temperature.

*e-mail: pitard@spht.saclay.cea.fr

The aim of this paper is to show that the situation is not
so simple: the folding rate depends very much on the
dynamics used for the simulations. In particular, we show
that for certain transition rates a large gap accelerates
the folding, whereas for other models it may slow it down.

II. MODELING BY A TRUNCATED REM 

Many models have stressed the analogy between protein
folding and the thermodynamics of heteropolymers [?]. In a
mean field approach, some models of quenched disordered
polymers are similar to a Random Energy Model (REM) [?];
the low-lying states of the protein are identified with the
low-lying states of the REM, which are responsible for the
slow dynamics. Although this analogy is questionable as far
as real proteins are concerned [11], [12], we will adopt
this framework for the dynamical models studied in the
following.

For the REM as well as for the Sherrington-Kirkpatrick
(SK) model [13], it has been shown that i) all low-lying
states have the same extensive energy and ii) the
distribution of the corrections to extensivity of their
free energies is exponential (bounded at high energy but
not at low energy).

For proteins, the energies are bounded below by that of the
native state and above by that of a swollen coil. In
addition, the energy landscape is known to be very rugged,
and the number of misfolded states (at fixed energy E) is
known to grow very rapidly with E. It is thus natural, by
analogy with many disordered systems, to assume an
exponential distribution for the energies of the
lowest-lying misfolded states and to isolate the native
state below the continuum of energies.

In our model, we describe the phase space of the protein as
consisting of $M$ misfolded states $E_\alpha$ with
$\alpha = 1, \ldots, M$ and one native state $E_0$. The
distribution of energies of the misfolded states is
continuous, given by:

$$p(E) = \beta_c \, e^{\beta_c (E - E_c)}$$

where $\beta_c$ is a parameter (related to the glass
transition temperature of the REM) and $E_c$ is the energy
of the highest state.

More precisely, the energy levels are such that
$E_{min} < E < E_c$, where $E_{min}$ is the energy of the
first misfolded state. Therefore, the Boltzmann weights
$B_\alpha = e^{-\beta E_\alpha}$ vary between
$B_{min} = e^{-\beta E_c}$ and $B_{max} = e^{-\beta E_{min}}$.
To account for the native state, we include an additional
state with energy $E_0$ and Boltzmann weight $B_0$.

The energy gap $\Delta$ is defined by
$\Delta = E_{min} - E_0 > 0$.
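The landscape defined above can be sketched numerically. The following is a minimal illustration, not the authors' procedure: it assumes $E_c = 0$, places the lower cutoff of the continuum at $E_c - \ln(M)/\beta_c$ (a hypothetical choice), normalizes $p(E)$ on that interval, and uses the illustrative names `sample_landscape` and `boltzmann_weights`.

```python
import math
import random

def sample_landscape(M, beta_c, gap, E_c=0.0, seed=0):
    """Return energies [E_0, E_1, ..., E_M] with the native state first."""
    rng = random.Random(seed)
    # Assumption: lower cutoff where M states with density
    # ~ exp(beta_c * (E - E_c)) would typically place their lowest one.
    E_low = E_c - math.log(M) / beta_c
    a = math.exp(beta_c * (E_low - E_c))
    # Inverse-CDF sampling of p(E) ~ exp(beta_c * (E - E_c)) on [E_low, E_c].
    misfolded = sorted(
        E_c + math.log(a + (1.0 - a) * rng.random()) / beta_c
        for _ in range(M)
    )
    E_min = misfolded[0]               # first (lowest) misfolded state
    return [E_min - gap] + misfolded   # native state sits a gap below E_min

def boltzmann_weights(energies, beta):
    """Boltzmann weights B = exp(-beta * E) for each level."""
    return [math.exp(-beta * E) for E in energies]

energies = sample_landscape(M=100, beta_c=1.0, gap=1.0)
B = boltzmann_weights(energies, beta=1.5)
print("gap =", energies[1] - energies[0])
print("B_0 / B_max =", B[0] / B[1])    # equals exp(beta * gap)
```

The last line checks the relation $B_0 = B_{max} e^{\beta\Delta}$, which follows directly from the definition of the gap.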

The main consequence of the truncation of the energy
distribution is that the system never spends a very long
time in one of the energy traps, and that $P_\alpha(t)$
always relaxes towards its equilibrium value
$P_\alpha^{eq}$ with a finite relaxation time at large
times. Slow-dynamics features such as aging do not appear
either; this is in contrast with the case of a
non-truncated distribution, which has been studied
extensively in recent years [14].

The calculations are made by first taking the limit
$M \to \infty$, before taking the limit of large times.
This is justified if the number of metastable conformations
is large enough.

The dynamics is based on the master equation:

$$\frac{dP_\alpha(t)}{dt} = \sum_\beta \left[ W_{\alpha\beta} P_\beta(t) - W_{\beta\alpha} P_\alpha(t) \right]$$

where $P_\alpha(t)$ is the probability of occupation of the
energy level $E_\alpha$ at time $t$, and $W_{\alpha\beta}$
is the transition rate from the energy level $E_\beta$ to
the energy level $E_\alpha$. In all the cases we have
studied, detailed balance is satisfied:

$$W_{\alpha\beta} \, e^{-\beta E_\beta} = W_{\beta\alpha} \, e^{-\beta E_\alpha}$$

When solving the master equation, the quantities that are
usually calculated are averages over the distribution of
disorder $p(E)$. This is justified in the case of a
macroscopic system with short range interactions, but not
in the case of a protein, which is too small an object for
self-averaging to hold.
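As an illustration of a master equation with rates obeying detailed balance, the sketch below integrates a four-level system for a single realization of the energies (no disorder average). The energies, the explicit Euler scheme and the names `master_rhs` and `integrate` are assumptions made for this example; the rate choice, depending only on the final state, is one of several that satisfy detailed balance.

```python
import math

def master_rhs(P, W):
    """dP_a/dt = sum_b [ W[a][b] * P[b] - W[b][a] * P[a] ]."""
    n = len(P)
    return [
        sum(W[a][b] * P[b] - W[b][a] * P[a] for b in range(n))
        for a in range(n)
    ]

def integrate(P0, W, dt, steps):
    """Explicit Euler integration (adequate only for dt << 1 / largest rate)."""
    P = list(P0)
    for _ in range(steps):
        dP = master_rhs(P, W)
        P = [p + dt * d for p, d in zip(P, dP)]
    return P

beta = 1.0
E = [-1.0, 0.0, 0.3, 0.7]    # hypothetical energies, lowest one first
n = len(E)
B = [math.exp(-beta * e) for e in E]
Z = sum(B)

# W[a][b] is the rate from state b to state a; here it depends only on a.
W = [[math.exp(-beta * E[a]) for b in range(n)] for a in range(n)]

detailed_balance = all(
    abs(W[a][b] * B[b] - W[b][a] * B[a]) < 1e-12
    for a in range(n) for b in range(n)
)

P_start = [1.0 / n] * n      # uniform initial condition
P_end = integrate(P_start, W, dt=1e-3, steps=5000)
P_eq = [b / Z for b in B]    # Boltzmann distribution
print(detailed_balance)
print([round(p, 4) for p in P_end])
print([round(p, 4) for p in P_eq])
```

Because the rates satisfy detailed balance, the stationary solution of the master equation is the Boltzmann distribution, which is what the integrated populations approach.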

This is why, in contrast with other studies, we have
calculated quantities which are not averaged over the
distribution of disorder. We considered three cases,
depending on the choice of transition rates between energy
levels. We find that if the transition rates depend only on
the final state, the relaxation is accelerated by a large
gap, whereas if they depend only on the initial state, the
dynamics is slowed down; in the intermediate case, the
situation is more complex.



III. RESULTS

In this section, we present the results of the
calculations, which will be detailed elsewhere.

Studies such as the ones presented below have already been
made for a non-truncated distribution of energies:
disorder-averaged quantities show stretched exponential or
power law behavior at large times [15], [16]. The same
models have been used in ref. [17] in the context of
heteropolymer folding.

A. Case where the transition rates depend only on
the final state: $W_{\alpha\beta} = e^{-\beta E_\alpha}$

If $E_0$ and a gap are included in the analysis, the master
equation can easily be solved, leading to an exponential
behavior,

$$P_\alpha(t) = \frac{B_\alpha}{Z} + \left( P_\alpha(0) - \frac{B_\alpha}{Z} \right) e^{-Zt}$$

where $Z = \sum_\alpha B_\alpha$. Since
$Z \simeq B_0 + M\bar{B}$, with $M\bar{B}$ proportional to
$B_{max}$ (the bar denotes the average over the
distribution of energies) and
$B_0 = B_{max} e^{\beta\Delta}$, the relaxation time is

$$\tau = \frac{1}{Z} \simeq \frac{1}{B_0 + M\bar{B}}$$

Let us compare the dynamics of two protein sequences that
differ only by their native energies $E_0$, keeping
$E_{min}$ (or $B_{max}$) constant. As seen above, the
relaxation time $\tau$ decreases when the gap $\Delta$
increases. Indeed, in such a dynamical scheme, jumps
towards the native state are enhanced, and the lower its
energy, the faster it is populated. The dynamics is
illustrated by Fig. 1, where the probability of occupation
of the native state $P_0(t)$ is plotted for different
values of the gap. In Fig. 2, we show the relaxation time
as a function of the gap.

FIG. 1. $P_0(t)$ in the case where
$W_{\alpha\beta} = e^{-\beta E_\alpha}$ for three sequences
with the same distribution of energies but different gaps.
From top to bottom, $\Delta = 1.5$, $\Delta = 1$,
$\Delta = 0.5$; $\beta = 1.5$ and $M = 100$. The dynamics
is faster for larger values of the gap.

FIG. 2. Relaxation time $\tau(\Delta)$ as a function of the
gap $\Delta$ in the case where
$W_{\alpha\beta} = e^{-\beta E_\alpha}$, with $\beta = 1.5$
and $\beta_c = \ldots$
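For rates that depend only on the final state, $W_{\alpha\beta} = e^{-\beta E_\alpha}$, the master equation reduces to $dP_\alpha/dt = B_\alpha - Z P_\alpha$ (using $\sum_\beta P_\beta = 1$), so every level relaxes with the single time $1/Z$. The sketch below evaluates this closed form for three gaps. The continuum of levels is a hypothetical one, evenly spread in cumulative probability of the exponential distribution with $\beta_c = 1$ and $E_c = 0$; these values and the function names are assumptions for illustration only.

```python
import math

def relaxation_time_final_state(misfolded, gap, beta):
    """tau = 1/Z for rates W_ab = exp(-beta * E_a); Z includes the native state."""
    E0 = min(misfolded) - gap
    Z = math.exp(-beta * E0) + sum(math.exp(-beta * E) for E in misfolded)
    return 1.0 / Z

def native_population(t, misfolded, gap, beta, P0_initial):
    """Closed form P_0(t) = B_0/Z + (P_0(0) - B_0/Z) * exp(-Z * t)."""
    E0 = min(misfolded) - gap
    B0 = math.exp(-beta * E0)
    Z = B0 + sum(math.exp(-beta * E) for E in misfolded)
    return B0 / Z + (P0_initial - B0 / Z) * math.exp(-Z * t)

# Hypothetical continuum: M levels at the quantiles of
# p(E) ~ exp(beta_c * (E - E_c)), with beta_c = 1 and E_c = 0 (assumed).
M, beta, beta_c = 100, 1.5, 1.0
misfolded = [math.log(k / M) / beta_c for k in range(1, M + 1)]

gaps = [0.5, 1.0, 1.5]
taus = [relaxation_time_final_state(misfolded, g, beta) for g in gaps]
for g, tau in zip(gaps, taus):
    print(f"gap = {g:3.1f}   tau = {tau:.3e}")
```

Increasing the gap raises $B_0$ and therefore $Z$, so the printed relaxation times decrease with the gap, in line with the trend described for this case.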



B. Case where the transition rates depend only on
the initial state: $W_{\alpha\beta} = e^{\beta E_\beta}$

As was noted in ref. [17], this corresponds to the case of
a unique barrier at energy $E^*$, through which the system
must pass in order to make a transition from state
$\alpha$ to $\beta$.

Calculations can be made both for uniform initial
conditions (all states are equally populated) and for
delta-like ones (the initial probability of occupation of a
specific state is one). They lead to the same conclusions
concerning the relaxation times, and we give here the
results for the case of uniform initial conditions. At
short times, the probability of occupation of states
follows a power law:

[equation missing]

At longer times, the relaxation is exponential. For a
non-native state

[equation missing]

These expressions involve two time constants:
$\tau_\alpha$ […], which is the relaxation time of the
energy level $\alpha$ in the absence of a gap, and
$\tau_0$ […], an expression involving $\beta\Delta$. This
last relaxation time is the one that governs the long time
dynamics of the native state.

Contrary to the previous case, keeping $E_{min}$ (or
$B_{max}$) fixed, $\tau_0$ now increases as $\Delta$
increases. In this scheme, the energy landscape can be
viewed as a collection of energy traps; as the energy $E$
of a state decreases, it becomes more difficult to escape
from this state; low energy states are populated quite
rapidly, and the system remains trapped in the native
state; as a result, the relaxation to the Boltzmann
distribution is slowed down. This effect is more pronounced
as the energy gap gets bigger (see Figs. 3 and 4).

FIG. 3. $P_0(t)$ in the case where
$W_{\alpha\beta} = e^{\beta E_\beta}$ for three sequences
with the same distribution of energies but different gaps.
From top to bottom, $\Delta = 3$, $\Delta = 2$,
$\Delta = 1$; $\beta = 1.5$ and $M = 100$. The dynamics
slows down for large values of the gap.

FIG. 4. Relaxation time $\tau(\Delta)$ as a function of the
gap $\Delta$ in the case where
$W_{\alpha\beta} = e^{\beta E_\beta}$, with $\beta = 1.5$
and $\beta_c = 1$.
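The slowing down with the gap can be checked numerically for a finite system. The sketch below is not the large-$M$ calculation of the text: it assumes rates $W_{\alpha\beta} = g_\beta = e^{\beta E_\beta}$ between all $N = M + 1$ states, for which the master equation reads $dP_\alpha/dt = \sum_\beta g_\beta P_\beta - N g_\alpha P_\alpha$, and a decaying mode $e^{-rt}$ must satisfy $\sum_\beta g_\beta / (N g_\beta - r) = 1$. The slowest nonzero rate lies between the two smallest values of $N g_\beta$ and is found by bisection. The level positions and the function name are hypothetical.

```python
import math

def slowest_relaxation_time(energies, beta, iterations=200):
    """Slowest relaxation time for rates W_ab = exp(beta * E_b) (initial state b)."""
    n = len(energies)
    g = sorted(math.exp(beta * E) for E in energies)   # exit factors, ascending

    def secular(r):
        # Decaying modes exp(-r t) satisfy sum_b g_b / (n * g_b - r) = 1.
        return sum(gb / (n * gb - r) for gb in g) - 1.0

    # The slowest nonzero rate lies strictly between the two smallest poles,
    # and secular(r) increases from -inf to +inf across that interval.
    lo, hi = n * g[0], n * g[1]
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if not (lo < mid < hi):        # interval exhausted at float precision
            break
        if secular(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 1.0 / (0.5 * (lo + hi))

# Hypothetical continuum: M levels at the quantiles of
# p(E) ~ exp(beta_c * (E - E_c)), with beta_c = 1 and E_c = 0 (assumed).
M, beta, beta_c = 100, 1.5, 1.0
misfolded = [math.log(k / M) / beta_c for k in range(1, M + 1)]

gaps = [1.0, 2.0, 3.0]
taus = [
    slowest_relaxation_time([misfolded[0] - g] + misfolded, beta)
    for g in gaps
]
for g, tau in zip(gaps, taus):
    print(f"gap = {g:3.1f}   slowest relaxation time = {tau:.3f}")
```

Lowering the native energy reduces its exit factor $g_0$, which pushes the slowest rate down, so the printed times grow with the gap.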



C. Intermediate case:
$W_{\alpha\beta} = e^{-\beta((1-\lambda)E_\alpha - \lambda E_\beta)}$

In this case, the transition rates depend on both the
initial and the final state through the above formula,
where $\lambda$ is a parameter between 0 and 1. Relaxation
to the native state is described by a new relaxation time
$\tau_0$:

[equation missing]

where $Z = \sum_\alpha B_\alpha^{1-\lambda}$.

The dependence of $\tau_0$ on the gap is now slightly more
complicated. One can distinguish two regimes:

1. if $\lambda \ll 1 - x$ and $\lambda < \ldots$, $\tau_0$
decreases as $\Delta$ increases (see Figs. 5 and 6), and
the relaxation is accelerated. On the other hand, if
$\lambda > \ldots$, $\tau_0$ increases with $\Delta$ for
$\Delta < \Delta_0$ and decreases with $\Delta$ for
$\Delta > \Delta_0$, where $\beta\Delta_0 = \log(\ldots)$.

FIG. 5. $P_0(t)$ in the case where
$W_{\alpha\beta} = e^{-\beta((1-\lambda)E_\alpha - \lambda E_\beta)}$
for four sequences with the same distribution of energies
but different gaps. From top to bottom, $\Delta = 2$,
$\Delta = 1$, $\Delta = 0.75$, $\Delta = 0.5$. We have
chosen $\lambda = 0.5$, $\beta = 2.5$ and $M = 100$. For
this set of parameters, the dynamics is faster as the gap
increases.

FIG. 6. $P_1(t)$ in the case where
$W_{\alpha\beta} = e^{-\beta((1-\lambda)E_\alpha - \lambda E_\beta)}$
for four sequences with the same distribution of energies
but different gaps. From top to bottom, $\Delta = 0.5$,
$\Delta = 0.75$, $\Delta = 1$, $\Delta = 2$. We have chosen
$\lambda = 0.5$, $\beta = 2.5$ and $M = 100$.
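A few properties of the intermediate rates can be verified directly. The sketch below, with hypothetical energies and an illustrative `rate` helper, checks that $W_{\alpha\beta} = e^{-\beta((1-\lambda)E_\alpha - \lambda E_\beta)}$ satisfies detailed balance, that the total exit rate from a state $\beta$ equals $Z B_\beta^{-\lambda}$ with $Z = \sum_\alpha B_\alpha^{1-\lambda}$, and that $\lambda = 0$ and $\lambda = 1$ give rates depending only on the final and only on the initial state respectively. It does not reproduce the expression for $\tau_0$.

```python
import math

def rate(E_final, E_initial, beta, lam):
    """W = exp(-beta * ((1 - lam) * E_final - lam * E_initial))."""
    return math.exp(-beta * ((1.0 - lam) * E_final - lam * E_initial))

beta, lam = 2.5, 0.5
E = [-2.0, -1.0, -0.6, -0.1, 0.0]     # hypothetical energies, native first
B = [math.exp(-beta * e) for e in E]
n = len(E)

# Detailed balance: W(a <- b) * B_b == W(b <- a) * B_a for every pair.
balanced = all(
    math.isclose(rate(E[a], E[b], beta, lam) * B[b],
                 rate(E[b], E[a], beta, lam) * B[a], rel_tol=1e-12)
    for a in range(n) for b in range(n)
)

# Total exit rate from state b. The a = b term cancels in the master
# equation, so including it is harmless: sum_a W(a <- b) = Z * B_b**(-lam).
Z = sum(b ** (1.0 - lam) for b in B)
exit_ok = all(
    math.isclose(sum(rate(E[a], E[b], beta, lam) for a in range(n)),
                 Z * B[b] ** (-lam), rel_tol=1e-12)
    for b in range(n)
)

# Limits: lam = 0 depends only on the final state, lam = 1 only on the initial one.
final_only = math.isclose(rate(-1.0, 0.3, beta, 0.0), rate(-1.0, -0.7, beta, 0.0))
initial_only = math.isclose(rate(0.3, -1.0, beta, 1.0), rate(-0.7, -1.0, beta, 1.0))
print(balanced, exit_ok, final_only, initial_only)
```

The two limits make explicit how this family interpolates between the final-state and initial-state cases.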



IV. DISCUSSION 

It seems difficult to compare our results with experiments
on real proteins, since no systematic study of the gap of
given sequences has been carried out experimentally.
However, there have been a number of lattice simulations
that lead to a variety of results and interpretations.
Klimov and Thirumalai [?] claim that there seems to be no
direct correlation between the gap and the folding
dynamics.

On the other hand, Sali, Shakhnovich and Karplus [?] have
shown that the "best" folders are those with the largest
energy gap. In all cases, the simulations are performed on
very short chains (27 monomers at best) and the
interactions between monomers are random with a Gaussian
distribution. The energy of a configuration is given by

$$E = \sum_{i<j} \Delta(r_i, r_j) \, B_{ij}$$

where $\Delta(r_i, r_j) = 1$ if monomers $i$ and $j$ are
neighbors on the lattice and […] with a negative parameter
$B_0$ in order to mimic the hydrophobic character of the
solvent. These previous studies are applicable only to very
short proteins. For longer chains, there are no results
indicating how a protein will find its folding path through
its complicated energy landscape. Some attempts have been
made to understand analytically the dynamics of random
heteropolymers [18], and at present the debate regarding
the interpretation of the simulations [?] is still open.
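The contact energy used in such lattice simulations can be sketched as follows. This is an illustration under assumptions, not the code of the cited studies: monomers sit on a cubic lattice, every pair at unit lattice distance counts as a contact (bonded pairs included, which only adds a conformation-independent constant), and the couplings are drawn from a Gaussian whose mean and width (`mean=-2.0`, `sigma=1.0`) are hypothetical values.

```python
import itertools
import random

def contact_energy(positions, couplings):
    """E = sum over pairs i < j of Delta(r_i, r_j) * B_ij on a cubic lattice."""
    energy = 0.0
    for i, j in itertools.combinations(range(len(positions)), 2):
        distance = sum(abs(a - b) for a, b in zip(positions[i], positions[j]))
        if distance == 1:              # Delta = 1 for lattice neighbors
            energy += couplings[(i, j)]
    return energy

def gaussian_couplings(n, mean, sigma, seed=0):
    """Random couplings B_ij for all pairs i < j, drawn from a Gaussian."""
    rng = random.Random(seed)
    return {(i, j): rng.gauss(mean, sigma)
            for i, j in itertools.combinations(range(n), 2)}

# Hypothetical 4-monomer chain: folded into a unit square, the three bonds
# plus one non-bonded contact (monomers 0 and 3) are lattice neighbors;
# stretched on a line, only the three bonds are.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
line = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
B = gaussian_couplings(4, mean=-2.0, sigma=1.0)
difference = contact_energy(square, B) - contact_energy(line, B)
print(difference, B[(0, 3)])           # the two values agree
```

The energy difference between the two conformations comes entirely from the single extra contact, which is how a negative mean coupling favors compact states.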

Our phenomenological approach based on a REM energy
landscape, although far from realistic, might capture the
long time relaxation laws of random chains. The finiteness
of objects such as proteins is taken into account by
truncating the energy spectrum. Moreover, we show that the
dynamics depends on the whole energy landscape, rather than
only on one parameter, such as the gap.

According to the type of transition rates used in the
dynamics, one can obtain different behaviors for the
relaxation as a function of the gap. It seems difficult to
decide which one is more realistic. It would be interesting
to have systematic results on the dynamics of synthesized
sequences, generated for example through mutation
experiments [21].



2. if $\lambda \gg 1 - x$, $\tau_0$ is an increasing
function of $\Delta$ for $\Delta < \Delta_0$ and a
decreasing function of $\Delta$ for $\Delta > \Delta_0$,
where $\beta\Delta_0 = \log(\ldots)$, with an argument
involving $(1-x)(1-\lambda)$.

[1] C. B. Anfinsen, Science 181, 223-230 (1973).